Policy iteration for perfect information stochastic mean payoff games with bounded first return times is strongly polynomial
Recent results of Ye and Hansen, Miltersen and Zwick show that policy
iteration for one or two player (perfect information) zero-sum stochastic
games, restricted to instances with a fixed discount rate, is strongly
polynomial. We show that policy iteration for mean-payoff zero-sum stochastic
games is also strongly polynomial when restricted to instances with bounded
first mean return time to a given state. The proof is based on methods of
nonlinear Perron-Frobenius theory, allowing us to reduce the mean-payoff
problem to a discounted problem with state dependent discount rate. Our
analysis also shows that policy iteration remains strongly polynomial for
discounted problems in which the discount rate can be state dependent (and even
negative) at certain states, provided that the spectral radii of the
nonnegative matrices associated to all strategies are bounded from above by a
fixed constant strictly less than 1.
Comment: 17 pages
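As a point of reference for the fixed-discount setting mentioned in this abstract, the following is a minimal sketch of Howard-style policy iteration for a one-player discounted problem. It is a toy illustration, not the mean-payoff algorithm analysed in the paper. The names `solve_linear` and `policy_iteration`, and the data layout (`actions[s]` is a list of `(reward, transition_probabilities)` pairs), are assumptions made for this example. Exact rationals are used so that the stopping test is not affected by rounding.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination over exact rationals."""
    n = len(b)
    m = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [x / pivot_value for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def policy_iteration(actions, gamma):
    """Return (policy, value) maximising the expected discounted reward.

    actions[s] is a list of (reward, probs) pairs (hypothetical layout);
    gamma is a fixed discount factor with 0 <= gamma < 1.
    """
    n = len(actions)
    policy = [0] * n
    while True:
        # Policy evaluation: solve (I - gamma * P_policy) v = r_policy.
        a = [[(1 if i == j else 0) - gamma * actions[i][policy[i]][1][j]
              for j in range(n)] for i in range(n)]
        b = [actions[i][policy[i]][0] for i in range(n)]
        v = solve_linear(a, b)

        def q(i, k):
            reward, probs = actions[i][k]
            return reward + gamma * sum(p * x for p, x in zip(probs, v))

        # Policy improvement: switch only on a strict improvement.
        new_policy = []
        for i in range(n):
            best = policy[i]
            for k in range(len(actions[i])):
                if q(i, k) > q(i, best):
                    best = k
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Toy instance: in state 0, either stay (reward 1) or move to state 1
# (reward 0); state 1 is absorbing with reward 3 per step.
toy_actions = [
    [(1, [1, 0]), (0, [0, 1])],
    [(3, [0, 1])],
]
toy_policy, toy_value = policy_iteration(toy_actions, Fraction(1, 2))
```

On this toy instance the iteration stops after one switch, with policy `[1, 0]` and value vector `[3, 6]`.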
Spectral Theorem for Convex Monotone Homogeneous Maps, and Ergodic Control
We consider convex maps f:R^n -> R^n that are monotone (i.e., that preserve
the product ordering of R^n), and nonexpansive for the sup-norm. This includes
convex monotone maps that are additively homogeneous (i.e., that commute with
the addition of constants). We show that the fixed point set of f, when it is
non-empty, is isomorphic to a convex inf-subsemilattice of R^n, whose dimension
is at most equal to the number of strongly connected components of a critical
graph defined from the tangent affine maps of f. This yields in particular a
uniqueness result for the bias vector of ergodic control problems. This
generalizes results obtained previously by Lanery, Romanovsky, and Schweitzer
and Federgruen, for ergodic control problems with finite state and action
spaces, which correspond to the special case of piecewise affine maps f. We
also show that the length of periodic orbits of f is bounded by the cyclicity
of its critical graph, which implies that the possible orbit lengths of f are
exactly the orders of elements of the symmetric group on n letters.
Comment: 38 pages, 13 Postscript figures
The Operator Approach to Entropy Games
Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov.
Tropical Cramer Determinants Revisited
We prove general Cramer type theorems for linear systems over various
extensions of the tropical semiring, in which tropical numbers are enriched
with information on multiplicity, sign, or argument. We obtain existence or
uniqueness results, which extend or refine earlier results of Gondran and
Minoux (1978), Plus (1990), Gaubert (1992), Richter-Gebert, Sturmfels and
Theobald (2005) and Izhakian and Rowen (2009). Computational issues are also
discussed; in particular, some of our proofs lead to Jacobi and Gauss-Seidel
type algorithms to solve linear systems in suitably extended tropical
semirings.
Comment: 41 pages, 5 figures
The max-plus Martin boundary
We develop an idempotent version of probabilistic potential theory. The goal
is to describe the set of max-plus harmonic functions, which give the
stationary solutions of deterministic optimal control problems with additive
reward. The analogue of the Martin compactification is seen to be a
generalisation of the compactification of metric spaces using (generalised)
Busemann functions. We define an analogue of the minimal Martin boundary and
show that it can be identified with the set of limits of "almost-geodesics",
and also the set of (normalised) harmonic functions that are extremal in the
max-plus sense. Our main result is a max-plus analogue of the Martin
representation theorem, which represents harmonic functions by measures
supported on the minimal Martin boundary. We illustrate it by computing the
eigenvectors of a class of translation invariant Lax-Oleinik semigroups. In
this case, we relate the extremal eigenvectors to the Busemann points of a
normed space.
Comment: 37 pages; 8 figures. v1: December 20, 2004. v2: June 7, 2005. Section 12 added
Hypergraph conditions for the solvability of the ergodic equation for zero-sum games
The ergodic equation is a basic tool in the study of mean-payoff stochastic
games. Its solvability entails that the mean payoff is independent of the
initial state. Moreover, optimal stationary strategies are readily obtained
from its solution. In this paper, we give a general sufficient condition for
the solvability of the ergodic equation, for a game with finite state space but
arbitrary action spaces. This condition involves a pair of directed hypergraphs
depending only on the "growth at infinity" of the Shapley operator of the
game. This refines a recent result of the authors which only applied to games
with bounded payments, as well as earlier nonlinear fixed point results for
order preserving maps, involving graph conditions.
Comment: 6 pages, 1 figure, to appear in Proc. 54th IEEE Conference on Decision and Control (CDC 2015)
How to find horizon-independent optimal strategies leading off to infinity: a max-plus approach
A general problem in optimal control consists of finding a terminal reward
that makes the value function independent of the horizon. Such a terminal
reward can be interpreted as a max-plus eigenvector of the associated
Lax-Oleinik semigroup. We give a representation formula for all these
eigenvectors, which applies to optimal control problems in which the state
space is non compact. This representation involves an abstract boundary of the
state space, which extends the boundary of metric spaces defined in terms of
Busemann functions (the horoboundary). Extremal generators of the eigenspace
correspond to certain boundary points, which are the limit of almost-geodesics.
We illustrate our results in the case of a linear quadratic problem.
Comment: 13 pages, 5 figures, to appear in Proc. 45th IEEE Conference on Decision and Control
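The notion of a terminal reward acting as a max-plus eigenvector can be illustrated on a finite-state toy problem. This is far simpler than the non-compact setting of the paper, where an abstract boundary is needed. In the sketch below, `a[i][j]` is an assumed reward for moving from state `i` to state `j`, and the function names and the matrix `toy_a` are hypothetical. If `phi` satisfies `max_j (a[i][j] + phi[j]) = lam + phi[i]`, then the value function with terminal reward `phi` and horizon `t` equals `t * lam + phi`, so optimal decisions do not depend on the horizon.

```python
def maxplus_apply(a, phi):
    """One dynamic-programming step: v[i] = max_j (a[i][j] + phi[j])."""
    return [max(aij + pj for aij, pj in zip(row, phi)) for row in a]


def is_maxplus_eigenvector(a, phi, lam):
    """Check the max-plus eigenvector equation a (x) phi = lam + phi."""
    return maxplus_apply(a, phi) == [lam + p for p in phi]


def value_function(a, terminal, horizon):
    """Value of the horizon-step problem with the given terminal reward."""
    v = list(terminal)
    for _ in range(horizon):
        v = maxplus_apply(a, v)
    return v


# Toy data (assumed): the best cycle is 0 -> 1 -> 0 with mean (3 + 2) / 2.
toy_a = [[1.0, 3.0],
         [2.0, 0.0]]
toy_phi = [0.0, -0.5]
toy_lam = 2.5

toy_value_4 = value_function(toy_a, toy_phi, 4)
```

Here `toy_value_4` equals `[10.0, 9.5]`, that is, `4 * toy_lam + toy_phi` componentwise. All numbers involved are multiples of one half, so the floating-point comparison is exact in this example.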
Log-majorization of the moduli of the eigenvalues of a matrix polynomial by tropical roots
We show that the sequence of moduli of the eigenvalues of a matrix polynomial
is log-majorized, up to universal constants, by a sequence of "tropical roots"
depending only on the norms of the matrix coefficients. These tropical roots
are the non-differentiability points of an auxiliary tropical polynomial, or
equivalently, the opposites of the slopes of its Newton polygon. This extends
to the case of matrix polynomials some bounds obtained by Hadamard, Ostrowski
and Pólya for the roots of scalar polynomials. We also obtain new bounds in
the scalar case, which are accurate for "fewnomials" or when the tropical roots
are well separated.
Comment: 36 pages, 19 figures
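The tropical roots described in this abstract can be computed from the coefficient norms alone. The sketch below assumes the norms are supplied as a list `norms[k]` for the coefficient of degree `k` (the choice of matrix norm is left to the caller), with nonzero first and last entries. It builds the upper boundary of the Newton polygon of the points `(k, log norms[k])` and returns the exponentials of the opposites of its slopes, each repeated according to the horizontal length of its edge. The function name `tropical_roots` is hypothetical.

```python
import math


def tropical_roots(norms):
    """Tropical roots of max_k norms[k] * x**k, in nondecreasing order.

    Assumes norms[0] > 0 and norms[-1] > 0; zero entries in between
    are simply skipped.
    """
    if norms[0] <= 0 or norms[-1] <= 0:
        raise ValueError("first and last coefficient norms must be positive")
    points = [(k, math.log(c)) for k, c in enumerate(norms) if c > 0]

    def cross(o, p, q):
        return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

    # Upper concave hull, scanning the points from left to right.
    hull = []
    for p in points:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)

    roots = []
    for (k0, y0), (k1, y1) in zip(hull, hull[1:]):
        slope = (y1 - y0) / (k1 - k0)
        # Each edge contributes one root with multiplicity k1 - k0.
        roots.extend([math.exp(-slope)] * (k1 - k0))
    return roots


# Norms 1, 4, 1: the tropical polynomial is max(1, 4x, x^2).
example_roots = tropical_roots([1.0, 4.0, 1.0])
```

For the norms `[1, 4, 1]` the two roots are approximately 0.25 and 4, the points where `1 = 4x` and `4x = x^2`. For `[8, 1, 1]` the middle point lies below the hull, and the single root `sqrt(8)` is returned with multiplicity two.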